Conversations on Contextuality


Ehtibar N. Dzhafarov Janne V. Kujala 

Purdue University University of Jyväskylä


Dramatis personae: 

Expositor, trying to present and clarify the main points of a certain view of 
contextuality. 

Interlocutor, skeptical but constructive. 

Authors, supportive of Expositor but sympathetic to Interlocutor (remain 
off-stage except for occasionally inserting footnotes). 


Conversation 1 

Expositor: My dear Interlocutor, as we have agreed, we will discuss a certain
approach to probabilistic contextuality. Its authors call it, perhaps not too
descriptively, the Contextuality-by-Default theory (CbD).[1] I think I should
begin by giving you an informal overview of the main ideas.

Interlocutor: My dear Expositor, I always find an informal presentation of
ideas a dubious exercise. If I do not understand the presentation clearly
(which happens often), it is never clear to me whether this is because it was
dumbed down so much as to become deficient in information, or because the
ideas themselves are deficient. Nevertheless I should let you proceed.

EXP: Let me try. Objects (or things, or properties — choose what you like) are 
measured under varying conditions, called contexts. The measurements are 
generally random variables, and their identity is defined by what is measured 
(object) and under what conditions it is measured (context). As a result, 
the same object measured in different contexts is represented by different 
random variables: it is meaningless to ask why they are different (hence the 
designation “contextuality-by-default”). Moreover, measurements made in 
different contexts, whether of the same object or of different objects, do not
have a joint distribution. One cannot, e.g., speak of their correlation or of
the probability with which they have the same value. All measurements
made within one and the same context, however, are jointly distributed.
The overall picture we have therefore is one of islands of jointly distributed
measurements (“bunches of random variables”) that are stochastically
unrelated to one another. Is this sufficiently clear?

1 Aut: The term indeed may not have been optimally chosen. It may suggest that all
systems of measurements are contextual until proven otherwise. This is not true. The
“by-default” in the name of the theory refers to the identification of the measurements
as random variables: their identity always (“by default”) depends on context.

Int: I am not sure. How does one define “objects” and “contexts”? 

EXP: Primitives of a theory cannot be explained conceptually except in their
relations to other primitives of the theory, and their operational meaning may
be outside the theory. The “objects” and “contexts” are such primitives:
formally, they are no more than labels defining the identity of a measurement
(so that each measurement is defined by two labels, one for “what”
and another for “in what context”).

Int: Perhaps we could clarify this with examples. 

EXP: Here is an example. Suppose we pose two Yes/No questions to randomly
chosen people and record their responses. The questions can be asked verbally
or presented in writing. Intuitively, a question asked is the “object” being
measured, the presentation mode (verbal or written) is the context, and the
response to a given question by a randomly chosen person is the measurement,
a random variable labeled by the question asked and by its presentation
mode.

Int: So we have four random variables, if I understood you correctly: response
to question A presented verbally, response to question A presented in writing,
and analogously for the second question, B.

EXP: Yes. Let me suggest notation for these four random variables:

$$R_A^V,\; R_A^W,\; R_B^V,\; R_B^W,$$

or, better still,

$$\left(R_A^V, R_B^V\right),\; \left(R_A^W, R_B^W\right).$$

R stands for response (whose values can be Yes or No), the subscript shows
the object (question), and the superscript shows the context (presentation
mode). We record $R_A^V$ and $R_B^V$ together because the two questions are posed
to the same person. For the same reason, $R_A^W$ and $R_B^W$ are recorded together.
Therefore $(R_A^V, R_B^V)$ are jointly distributed, and so are $(R_A^W, R_B^W)$. This
accords with the general rule: random variables recorded in the same context
are jointly distributed. Now, we assume that a person is asked either two
written questions or two verbal questions. So, $R_A^V$ is never recorded together
with (which means here, never obtained from the same person as) $R_A^W$ or
$R_B^W$. Therefore such pairs as $R_A^V, R_A^W$ or $R_A^V, R_B^W$ have no joint
distribution.

Int: What does this mean exactly, not to have a joint distribution? 

EXP: It is clear that for each of our four variables we have well-defined
probabilities with which their value is Yes: this is the probability with which a
randomly chosen person will respond Yes to the corresponding question in
the corresponding context. Say, the probability of the event $[R_A^V = \text{Yes}]$ is
0.4, the probability of $[R_B^V = \text{Yes}]$ is 0.5, and for $[R_B^W = \text{Yes}]$ the
probability is, say, 0.7. All these probabilities are well-defined theoretically and can
be estimated empirically. Since we ask A and B in the verbal mode together
(of the same person), we can also define and estimate the joint probability
of $[R_A^V = \text{Yes and } R_B^V = \text{Yes}]$. For instance, if it equals 0.4 × 0.5, then the
two random variables are independent; if it equals 0.4, they are maximally
positively correlated, etc.

Int: I see. And, I understand, the situation is different with $R_A^V, R_A^W$ or
$R_B^V, R_B^W$, because there is no meaning in which one could define “and” in,
say, $[R_B^V = \text{Yes and } R_B^W = \text{Yes}]$.

EXP: Precisely. The probability of this conjunction is undefined and cannot
be estimated empirically. So the general rule is: no two random variables
recorded in different (mutually exclusive) contexts possess a joint
distribution. Let’s agree to call such variables stochastically unrelated (to each
other).

Int: What if I modified the design of the survey, and asked only one question
per person, in writing or verbally? Wouldn’t then even $R_A^V$ and $R_B^V$ be
stochastically unrelated? And wouldn’t this contradict our general rules?

EXP: No combinations of objects and contexts can contradict our general rules 
because these combinations have to be chosen in accordance with these rules. 
In your modified set-up, if we continue to view the questions A and B as 
our sole and distinct objects, then the contexts involve not only the mode 
of presentation but also the identity of the questions themselves: (A,V), 
(B,W), etc. So the random variables we record are

$$R_A^{(A,V)},\; R_A^{(A,W)},\; R_B^{(B,V)},\; R_B^{(B,W)}.$$

Since no two of them share a context, they are pairwise stochastically
unrelated.

Int: But how do I know which of the representations to use, $R_A^V$ or $R_A^{(A,V)}$?

EXP: First you have to decide (outside the CbD theory) on the empirical
meaning of “co-occurrence” or “occurrence together” in your study. In our
examples you consider questions posed to one and the same person (and responses
obtained from one and the same person) as co-occurring. You also know the
rules, so you always use different contexts, whatever your choice of the labels
for them, for the measurements that do not co-occur. Suppose I simplify
your design by forgetting about the presentation modes. I ask one of two
questions, A or B, of randomly chosen people. The objects being measured
then are A and B again, and we know that the measurements of these
objects never co-occur, hence they are stochastically unrelated. Therefore the
questions themselves (or any two labels corresponding to them one-to-one)
are the contexts here,

$$R_A^A,\; R_B^B.$$

Int: I think I now understand the notion of stochastic unrelatedness and your
notation. But couldn’t we also say in all such cases that the two random
variables, e.g., $R_A^A, R_B^B$, are stochastically independent?

EXP: This would be a common way of thinking of this situation. But it is
incorrect. One could only say, with some caution, that they can always
be treated as if they were independent. We will get to this later, when
we consider the notion of a coupling. You can, however, appreciate the
difference between the situation when a randomly chosen person is being
asked one of two questions, A or B, and the situation when a randomly
chosen person is being asked both these questions, A and B. The two
responses in the latter case are random variables

$$R_A^{(A,B)},\; R_B^{(A,B)}$$

that are jointly distributed: one can define and estimate the probability of the
joint event $[R_A^{(A,B)} = \text{Yes and } R_B^{(A,B)} = \text{Yes}]$. Suppose we find out that this
probability equals the product of the probability of $[R_A^{(A,B)} = \text{Yes}]$ and the
probability of $[R_B^{(A,B)} = \text{Yes}]$ taken separately. Then we say that the random
variables $R_A^{(A,B)}$ and $R_B^{(A,B)}$ are stochastically independent. This is
fundamentally different from the situation with $R_A^A$ and $R_B^B$, when the joint
event $[R_A^A = \text{Yes and } R_B^B = \text{Yes}]$ is simply undefined.[2]


Int: I see the difference. I still have questions about the objects and contexts, 
but I think I should allow you to continue your presentation of CbD. The 
double-notation, I understand, is only a departure point. 


EXP: Yes, it is. And we will indeed continue to address your misgivings as
we proceed. By now we have established a certain picture of a system of
measurements: it consists of islands (or bunches) of jointly distributed random
variables, with the islands stochastically unrelated to one another. The main
idea of CbD is that these isolated bunches can be characterized by exploring all
possible couplings thereof (or all possible joint distributions imposable on them)
under well-chosen constraints.


Int: What is a coupling?

EXP: A coupling for stochastically unrelated random variables $X, Y, \ldots, Z$ is
a jointly distributed $(\tilde{X}, \tilde{Y}, \ldots, \tilde{Z})$ in which $\tilde{X}$ has the same distribution
as $X$, $\tilde{Y}$ has the same distribution as $Y$, and so on.[3] For instance, in our
example with $R_A^V$ and $R_B^W$, let the distributions of these random variables
be defined by (with Pr standing for probability)

$$\Pr[R_A^V = \text{Yes}] = 0.4, \quad \Pr[R_B^W = \text{Yes}] = 0.7.$$

The measurements $R_A^V$ and $R_B^W$ are stochastically unrelated, so
$\Pr[R_A^V = \text{Yes and } R_B^W = \text{Yes}]$ is undefined. To couple them means to create a
new pair of random variables, $\tilde{R}_A^V$ and $\tilde{R}_B^W$, that are distributional copies
of, respectively, $R_A^V$ and $R_B^W$ but are jointly distributed. That is,

$$\Pr[\tilde{R}_A^V = \text{Yes}] = 0.4, \quad \Pr[\tilde{R}_B^W = \text{Yes}] = 0.7,$$

but, unlike the “originals” $R_A^V$ and $R_B^W$, their distributional copies have a
well-defined joint probability,

$$\Pr[\tilde{R}_A^V = \text{Yes and } \tilde{R}_B^W = \text{Yes}].$$

It can be shown that this probability can have any value from 0.1 (maximally
negative relation) to 0.4 (maximally positive relation). The independent
coupling, with this probability equal to 0.4 × 0.7, is within this range.

2 Aut: For a detailed discussion, see Dzhafarov and Kujala (2014c) and Dzhafarov (2015).
3 Aut: See Thorisson (2000). A traditional definition of a coupling does not require that
the random variables being coupled be stochastically unrelated, but in the present context it
is the only application.
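
The range quoted above can be checked numerically. What follows is a minimal sketch and not part of the original conversation: the function names `coupling_range` and `coupling_table` are hypothetical, and the bounds used are the standard Fréchet bounds for two binary variables with given marginals.

```python
from fractions import Fraction

def coupling_range(p, q):
    """Range of Pr[X~ = Yes and Y~ = Yes] over all couplings of two binary
    variables with Pr[X = Yes] = p and Pr[Y = Yes] = q (Frechet bounds)."""
    lower = max(Fraction(0), p + q - 1)   # maximally negative relation
    upper = min(p, q)                     # maximally positive relation
    return lower, upper

def coupling_table(p, q, both_yes):
    """The 2x2 joint distribution fixed by the single free parameter both_yes."""
    lower, upper = coupling_range(p, q)
    if not lower <= both_yes <= upper:
        raise ValueError("no coupling has this joint probability")
    return {
        ("Yes", "Yes"): both_yes,
        ("Yes", "No"): p - both_yes,
        ("No", "Yes"): q - both_yes,
        ("No", "No"): 1 - p - q + both_yes,
    }

# Marginals from the example: 0.4 and 0.7.
p, q = Fraction(4, 10), Fraction(7, 10)
low, high = coupling_range(p, q)
independent = coupling_table(p, q, p * q)   # the independent coupling
print(low, high, independent[("Yes", "Yes")])
```

Every value of `both_yes` between the two bounds yields a different table with the same marginals, which is why the set of couplings is infinite whenever the bounds differ.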

Int: And any of these values will define a pair $(\tilde{R}_A^V, \tilde{R}_B^W)$ that is a coupling for
$R_A^V$ and $R_B^W$?

EXP: Yes. The number of all possible couplings for a given pair of random
variables is typically infinite. For binary $X, Y$ (say, Yes/No ones) the only
exceptions are the pairs with $\Pr[X = \text{Yes}]$ or $\Pr[Y = \text{Yes}]$ having the values
0 or 1. For such a pair only one coupling is possible.

Int: Let me try an example with continuous distributions. Take random
variables $R$ and $S$ that are standard normally distributed. Then any bivariate
normally distributed $(\tilde{R}, \tilde{S})$ with standard normal marginals is a coupling
of $R$ and $S$.

EXP: Yes, and there are other couplings for these $R$ and $S$ too: it can be
any $(\tilde{R}, \tilde{S})$ whose joint distribution is well-defined, and whose individual
(marginal) distributions are standard normal.

Int: Just so that we don’t focus on pairs of random variables exclusively, what
would be a coupling for the four random variables $R_A^V, R_A^W, R_B^V, R_B^W$ in our
original example? There we had two pairs (you called them “bunches”) with
jointly distributed components, $(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$.

EXP: Because of the latter, this example too can be presented as involving just
pairs: $(R_A^V, R_B^V)$ is a random variable too, unless we use the term very
narrowly. But this is not critical: it is the same thing to seek a coupling for
$(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$, each with a known distribution, and to seek a
coupling for $R_A^V, R_A^W, R_B^V, R_B^W$ in which you know the distributions of $(R_A^V, R_B^V)$
and $(R_A^W, R_B^W)$. The joint distribution of $(R_A^V, R_B^V)$ is determined by four
probabilities

$(R_A^V, R_B^V)$:   (Yes, Yes)   (Yes, No)   (No, Yes)   (No, No)
probability:        $p_{YY}$     $p_{YN}$    $p_{NY}$    $p_{NN}$

with the probabilities summing to 1, of course. And the joint distribution
of $(R_A^W, R_B^W)$ is determined analogously,

$(R_A^W, R_B^W)$:   (Yes, Yes)   (Yes, No)   (No, Yes)   (No, No)
probability:        $q_{YY}$     $q_{YN}$    $q_{NY}$    $q_{NN}$

To couple these pairs (equivalently, to couple all four random variables
$R_A^V, R_A^W, R_B^V, R_B^W$) means to create a quadruple $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ with
jointly distributed components such that the distributions of the pairs $(\tilde{R}_A^V, \tilde{R}_B^V)$
and $(\tilde{R}_A^W, \tilde{R}_B^W)$ are the same as those of $(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$,
respectively. For instance,

$$\Pr[\tilde{R}_A^V = \text{Yes and } \tilde{R}_B^V = \text{No}] = p_{YN}, \quad \Pr[\tilde{R}_A^W = \text{No and } \tilde{R}_B^W = \text{No}] = q_{NN}, \text{ etc.}$$

Int: And to create a quadruple $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ with jointly distributed
components means ...

EXP: It means to assign to each of the 16 possible quadruples of values a
probability value:

$(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$:   (Yes, Yes, Yes, Yes)   (Yes, Yes, Yes, No)   ...   (No, No, No, No)
probability:                                                $r_{YYYY}$             $r_{YYYN}$            ...   $r_{NNNN}$

To form a coupling, these probabilities should agree with the observed p and q
values: e.g.,

$$\sum_{i,j \in \{\text{Yes},\text{No}\}} r_{YiNj} = p_{YN}, \quad \sum_{i,j \in \{\text{Yes},\text{No}\}} r_{iNjN} = q_{NN}.$$
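
The marginal constraints above can be made concrete with a small sketch (not part of the original conversation). The joint probabilities in `p` and `q` are hypothetical numbers, chosen only to be consistent with the marginals 0.4, 0.5 and 0.7 used earlier; `independent_coupling` and `bunch_marginals` are illustrative names. The coupling built here treats the two bunches as independent, which is just one of the infinitely many admissible choices.

```python
from fractions import Fraction as F
from itertools import product

VALUES = ("Yes", "No")

# Hypothetical bunch distributions; keys are (value of R_A, value of R_B).
p = {("Yes", "Yes"): F(3, 10), ("Yes", "No"): F(1, 10),
     ("No", "Yes"): F(2, 10), ("No", "No"): F(4, 10)}   # context V
q = {("Yes", "Yes"): F(5, 10), ("Yes", "No"): F(2, 10),
     ("No", "Yes"): F(2, 10), ("No", "No"): F(1, 10)}   # context W

def independent_coupling(p, q):
    """Assign a probability r to each of the 16 quadruples, ordered as
    (R_A^V, R_A^W, R_B^V, R_B^W), by multiplying the two bunch tables."""
    return {(av, aw, bv, bw): p[(av, bv)] * q[(aw, bw)]
            for av, aw, bv, bw in product(VALUES, repeat=4)}

def bunch_marginals(r):
    """Sum r over the other bunch to recover the V-pair and W-pair tables."""
    pv = {k: F(0) for k in product(VALUES, repeat=2)}
    qw = {k: F(0) for k in product(VALUES, repeat=2)}
    for (av, aw, bv, bw), prob in r.items():
        pv[(av, bv)] += prob   # e.g. the sum over i, j of r_{YiNj} gives p_YN
        qw[(aw, bw)] += prob   # e.g. the sum over i, j of r_{iNjN} gives q_NN
    return pv, qw

r = independent_coupling(p, q)
pv, qw = bunch_marginals(r)
print(len(r), pv == p, qw == q)
```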


Int: And, of course, this can generally be done in an infinite number of ways. 
I think it is clear. You said earlier that you wanted to characterize the 
isolated bunches of random variables by exploring all possible couplings of 
these bunches under well-chosen constraints. Tell me now what you mean 
by the “well-chosen constraints” for couplings. 

EXP: The “well-chosen constraints” depend on (and determine) the aspect of
the system of mutually unrelated bunches you want to characterize. If we
are interested in contextuality, the idea (arguably, the most original idea in
the CbD approach) is to look for couplings in which the measurements of one
and the same object under different conditions are equal to each other with
as high a probability as possible. Contextuality is determined by computing
these highest probabilities for each object in isolation and then determining
if they are compatible with the observed bunches of measurements.

Int: I assume you want to elaborate. 

EXP: I will. But I think we will relegate this to our next conversation. 


Conversation 2 


EXP: My dear Interlocutor, you asked me to elaborate on what I said about the
measurements of an object in different contexts being equal to each other
with as high a probability as possible. First of all, let me emphasize that the
probability of being equal to each other applies to the couplings rather than
the coupled random variables themselves. When I find a bivariate-normally
distributed coupling $(\tilde{R}, \tilde{S})$ for standard normally distributed $R$ and $S$, I
do not make $R$ and $S$ jointly distributed, I merely create jointly distributed
“copies” of $R$ and $S$. And there is generally an infinite set of such
couplings. Among them there is one coupling, with the correlation between
$\tilde{R}$ and $\tilde{S}$ equal to 1 (defining a degenerate bivariate-normal distribution),
in which $\Pr[\tilde{R} = \tilde{S}]$ has the highest possible value (in this case, 1). It is
called a maximal coupling. If $R$ and $S$ are normally distributed but with
different means and/or variances, then the maximal couplings still exist, but
not among bivariate normally distributed $(\tilde{R}, \tilde{S})$, and the highest possible
probability for $\tilde{R} = \tilde{S}$ is less than 1.


Int: Let me switch back to our original example to understand this. In
$R_A^V, R_A^W, R_B^V, R_B^W$ we have $R_A^V$ and $R_A^W$ measuring the same object A in
two contexts. We take them and look for a coupling $(\tilde{R}_A^V, \tilde{R}_A^W)$ for them,
forgetting for the time being all about the remaining two variables. We
ask the question: what is the maximal possible probability for the event
$\tilde{R}_A^V = \tilde{R}_A^W$? We know that, by definition of a coupling,

$$\Pr[\tilde{R}_A^V = 1] = \Pr[R_A^V = 1] = p_A^V, \quad \Pr[\tilde{R}_A^W = 1] = \Pr[R_A^W = 1] = p_A^W.$$

I assume this allows me to compute the maximal possible value for
$\Pr[\tilde{R}_A^V = \tilde{R}_A^W]$.

EXP: Yes. This maximal value is $1 - |p_A^V - p_A^W|$. It is very easy to prove.[4]

4 Aut: See, e.g., Dzhafarov et al. (2015a).
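
The value $1 - |p_A^V - p_A^W|$ can be illustrated by building a maximal coupling explicitly. This is a minimal sketch, not taken from the original text: the function names are hypothetical, the two values are coded as 1 and 0, and the numbers 0.4 and 0.7 are illustrative. The construction puts as much probability mass as the marginals allow on the two "equal" cells.

```python
from fractions import Fraction as F

def maximal_coupling(p, q):
    """Joint table for (X~, Y~) with Pr[X~ = 1] = p and Pr[Y~ = 1] = q
    that maximizes Pr[X~ = Y~]."""
    both_one = min(p, q)             # largest mass the marginals allow on (1, 1)
    both_zero = min(1 - p, 1 - q)    # largest mass the marginals allow on (0, 0)
    return {
        (1, 1): both_one,
        (1, 0): p - both_one,
        (0, 1): q - both_one,
        (0, 0): both_zero,
    }

def prob_equal(table):
    """Probability that the two coupled variables take the same value."""
    return table[(1, 1)] + table[(0, 0)]

p, q = F(4, 10), F(7, 10)            # illustrative marginals
table = maximal_coupling(p, q)
print(prob_equal(table), 1 - abs(p - q))
```

No coupling can do better, because the mass on (1, 1) cannot exceed either marginal probability of the value 1, and likewise for (0, 0).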
Int: I will take your word for it. So we have

$$\max_{\text{all couplings } (\tilde{R}_A^V, \tilde{R}_A^W)} \Pr[\tilde{R}_A^V = \tilde{R}_A^W] = 1 - |p_A^V - p_A^W|.$$

After that I forget all about $R_A^V$ and $R_A^W$ and focus on $R_B^V$ and $R_B^W$, the
other two measurements of one and the same object in two contexts. By
analogy,

$$\max_{\text{all couplings } (\tilde{R}_B^V, \tilde{R}_B^W)} \Pr[\tilde{R}_B^V = \tilde{R}_B^W] = 1 - |p_B^V - p_B^W|.$$

How does one proceed from here?

EXP: Now we have to take all four of our random variables and construct a
coupling $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ for them. We have already discussed how we do
this. Except in special cases, there is an infinity of such couplings. What we
are now interested in is whether among all these couplings there is at least
one in which

$$\Pr[\tilde{R}_A^V = \tilde{R}_A^W] = 1 - |p_A^V - p_A^W| \quad \text{and} \quad \Pr[\tilde{R}_B^V = \tilde{R}_B^W] = 1 - |p_B^V - p_B^W|.$$

If the answer to this question is affirmative, then we say that the system
of measurements, in this case comprised of $(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$, is
noncontextual. If there are no such couplings, then the system is contextual.


Int: Let me first see how this works if

$$p_A^V = p_A^W \quad \text{and} \quad p_B^V = p_B^W.$$

The maximal probabilities of $\tilde{R}_A^V = \tilde{R}_A^W$ and of $\tilde{R}_B^V = \tilde{R}_B^W$ then are equal
to 1, for both A and B. So, if I can find a coupling $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ in
which $\tilde{R}_A^V$ is always the same as $\tilde{R}_A^W$ and $\tilde{R}_B^V$ is always the same as $\tilde{R}_B^W$,
then the system is noncontextual.


EXP: This is in fact the traditional understanding of contextuality (expressed
in the language of CbD).[5] If $\tilde{R}_A^V$ and $\tilde{R}_A^W$ are always the same, one can say
that the measurement of A does not depend on what the context is, V or
W; and analogously for $\tilde{R}_B^V$ and $\tilde{R}_B^W$.

5 Aut: See Dzhafarov and Kujala (2014a,c).

Int: The adjective “noncontextual” here seems intuitive to me. Let me now
consider the case when $p_A^V \neq p_A^W$, i.e., the maximal probability
$1 - |p_A^V - p_A^W| < 1$. So we begin by computing ... But wait: the measurements of A
here have different distributions in context V and in context W, so these
measurements are context-dependent. Why don’t we declare this system of
measurements contextual, without computing anything else?

EXP: You are touching on a subtle conceptual and terminological issue. Nothing
prevents one from calling this system contextual, but in the terminology of
CbD it is called inconsistently connected (and if $p_A^V = p_A^W$ and $p_B^V = p_B^W$, the
system is consistently connected). If you insist on using the word “contextuality”
whenever a system has $p_A^V \neq p_A^W$ or $p_B^V \neq p_B^W$, call this “contextuality-1.”
Or use another qualifier, but distinguish this form of contextuality from
the contextuality in the sense of CbD (call it “contextuality-2” if you like).
It is the form of contextuality that may exist on top of the “contextuality-1.”


Int: I still have misgivings, but we can return to this later. Let me resume
my attempt to understand how this “contextuality-2” works in the case
$p_A^V \neq p_A^W$ and/or $p_B^V \neq p_B^W$. We compute the maximal probability of
$\tilde{R}_A^V = \tilde{R}_A^W$ across all couplings $(\tilde{R}_A^V, \tilde{R}_A^W)$ for $R_A^V$ and $R_A^W$; and,
separately, we compute the maximal probability of $\tilde{R}_B^V = \tilde{R}_B^W$ across all
couplings $(\tilde{R}_B^V, \tilde{R}_B^W)$ for $R_B^V$ and $R_B^W$. These probabilities, you tell me, are
$1 - |p_A^V - p_A^W|$ and $1 - |p_B^V - p_B^W|$, respectively. Then we look at all
possible couplings $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ for all four of our random variables,
and for each of them we compute the probabilities of $\tilde{R}_A^V = \tilde{R}_A^W$ and of
$\tilde{R}_B^V = \tilde{R}_B^W$.


EXP: It is clear that these probabilities cannot exceed the values $1 - |p_A^V - p_A^W|$
and $1 - |p_B^V - p_B^W|$, respectively, because every sub-coupling $(\tilde{R}_A^V, \tilde{R}_A^W)$ of
$(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ is also one of the possible couplings for $R_A^V$
and $R_A^W$ taken separately; and analogously for $(\tilde{R}_B^V, \tilde{R}_B^W)$ and $R_B^V, R_B^W$.

Int: Yes, I see this. Now, if in some of the couplings $(\tilde{R}_A^V, \tilde{R}_A^W, \tilde{R}_B^V, \tilde{R}_B^W)$ both
these probabilities are achieved, then we say the system is noncontextual
(lacks “contextuality-2”). Otherwise it is contextual.


EXP: Yes. The intuition here is that there is something in the relationship
between $(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$ that cannot be reduced to the separate
effects of the context changes on the responses to A and on the responses to
B.


Int: Wait, wait. Couldn’t we then simply reformulate the problem by taking
(A, B) as a single object? Then we would have two stochastically unrelated
random variables

$$R_{(A,B)}^V,\; R_{(A,B)}^W,$$

each with four possible values, (Yes-Yes), (Yes-No), etc. We will have
contextuality if their distributions are different. This would be “contextuality-1,”
of course.

EXP: This is definitely a possible approach. I have mentioned already that
objects and contexts are primitives of the CbD theory, which means that the
theory does not dictate their choice. The only constraint imposed by the
theory is that the random variables measured in the same context have a
joint distribution, while random variables in different contexts do not. If you
choose a single object in two contexts, as you have proposed, you will simply
be dealing with a different problem. The system comprised of $R_{(A,B)}^V$ and
$R_{(A,B)}^W$ may or may not be inconsistently connected (I suggest we stick to this
term instead of “contextuality-1” and the like), but irrespective of this, it is
noncontextual. If it is inconsistently connected, then the system we considered
previously, $(R_A^V, R_B^V)$ and $(R_A^W, R_B^W)$, may or may not be consistently
connected, and in either case it can be contextual or noncontextual.

Int: Then I was wrong: inconsistent connectedness in $\{R_{(A,B)}^V, R_{(A,B)}^W\}$ does
not predict or account for the contextuality in $\{(R_A^V, R_B^V), (R_A^W, R_B^W)\}$.

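EXP: No, it does not. The system $\{(R_A^V, R_B^V), (R_A^W, R_B^W)\}$ can be shown to be
noncontextual if and only if

$$\left|\langle R_A^V R_B^V\rangle - \langle R_A^W R_B^W\rangle\right| \le \left|\langle R_A^V\rangle - \langle R_A^W\rangle\right| + \left|\langle R_B^V\rangle - \langle R_B^W\rangle\right|,$$

where $\langle\cdot\rangle$ is expected value.[6] You can easily verify that this inequality may
hold or fail with the distributions of $R_{(A,B)}^V$ and $R_{(A,B)}^W$ being different.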
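
The last remark can be verified with a short sketch (not part of the original conversation). It assumes the responses are coded as +1 and -1, as in the tables of Conversation 3; the three distributions and all function names are hypothetical illustrations.

```python
from fractions import Fraction as F

def expectations(table):
    """Return (<R_A>, <R_B>, <R_A R_B>) for a table {(a, b): probability},
    with a and b coded as +1 / -1 (an assumed encoding of Yes / No)."""
    e_a = sum(a * pr for (a, b), pr in table.items())
    e_b = sum(b * pr for (a, b), pr in table.items())
    e_ab = sum(a * b * pr for (a, b), pr in table.items())
    return e_a, e_b, e_ab

def criterion_sides(table_v, table_w):
    """Left-hand and right-hand sides of the noncontextuality inequality."""
    a_v, b_v, ab_v = expectations(table_v)
    a_w, b_w, ab_w = expectations(table_w)
    return abs(ab_v - ab_w), abs(a_v - a_w) + abs(b_v - b_w)

def is_noncontextual(table_v, table_w):
    lhs, rhs = criterion_sides(table_v, table_w)
    return lhs <= rhs

# Three hypothetical bunch distributions over (R_A, R_B).
uniform = {(a, b): F(1, 4) for a in (1, -1) for b in (1, -1)}
skewed = {(1, 1): F(1, 2), (1, -1): F(1, 4), (-1, 1): F(0), (-1, -1): F(1, 4)}
correlated = {(1, 1): F(2, 5), (1, -1): F(1, 10),
              (-1, 1): F(1, 10), (-1, -1): F(2, 5)}

# Both pairs of contexts have different joint distributions,
# yet the inequality holds for the first pair and fails for the second.
print(is_noncontextual(skewed, uniform))
print(is_noncontextual(correlated, uniform))
```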

Int: I wonder: even if we get a completely different system by doing this, is it 
always possible to get rid of contextuality in the sense of CbD by redefining 
the objects? 

EXP: The answer to this is yes, but generally not by grouping the objects
together, as in $R_{(A,B)}^V, R_{(A,B)}^W$. Consider, e.g., a system involving three
objects $q, q', q''$ and three contexts $c, c', c''$, combined in the measurements as
follows:[7]

$$\left(R_q^c, R_{q'}^c\right),\; \left(R_{q'}^{c'}, R_{q''}^{c'}\right),\; \left(R_{q''}^{c''}, R_q^{c''}\right).$$

As always, the pairs of random variables labeled by the same context are
jointly distributed, and different pairs are stochastically unrelated. One
cannot now put all three objects together as a single object. Instead one
can do something universally applicable: taking a measurement’s context
as part of the identity of the measurement’s object. In our case this means
replacing $q$ in $R_q^c$ with $(q,c)$, replacing $q$ in $R_q^{c''}$ with $(q,c'')$, etc. We will
get then in place of the system above a new system

$$\left(R_{(q,c)}^c, R_{(q',c)}^c\right),\; \left(R_{(q',c')}^{c'}, R_{(q'',c')}^{c'}\right),\; \left(R_{(q'',c'')}^{c''}, R_{(q,c'')}^{c''}\right).$$

In this system no two measurements share their object, and the system
is readily seen as trivially noncontextual (and also consistently connected,
again in the trivial sense).

6 Aut: This is an example of a cyclic system with n = 2, as defined in Conversation 3.
7 Aut: This is an example of a cyclic system with n = 3.
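
The relabeling can be sketched as a small data-structure exercise (an illustration only, not part of the original text). Each measurement is represented by an (object, context) pair; `shared_objects` and `absorb_context` are hypothetical helper names.

```python
from collections import defaultdict

# Each measurement is labeled by (object, context), as in the three-bunch system.
system = [("q", "c"), ("q'", "c"),
          ("q'", "c'"), ("q''", "c'"),
          ("q''", "c''"), ("q", "c''")]

def shared_objects(measurements):
    """Objects that are measured in more than one context."""
    contexts_by_object = defaultdict(set)
    for obj, ctx in measurements:
        contexts_by_object[obj].add(ctx)
    return {obj: ctxs for obj, ctxs in contexts_by_object.items() if len(ctxs) > 1}

def absorb_context(measurements):
    """Make the context part of the object's identity: q in c becomes (q, c)."""
    return [((obj, ctx), ctx) for obj, ctx in measurements]

relabeled = absorb_context(system)
print(sorted(shared_objects(system)))   # every object appears in two contexts
print(shared_objects(relabeled))        # no object is shared after relabeling
```

After relabeling there is nothing left to compare across contexts, which is why the resulting system is noncontextual only in a trivial sense.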

Int: This universal trick then consists in declaring any object in a new context 
to be a new object. For instance, a question presented verbally and the same 
(in content) question presented in writing are different questions. 

EXP: Yes, and there is nothing incorrect about this, at least not from the point 
of view of CbD. It just makes the issue of contextuality uninteresting. 

Int: So we will not use this trick for the sake of keeping our discussion
interesting. However, philosophically speaking, it may very well be the case that
finding a system of measurements contextual means that “the same” objects
in different contexts are not really the same.

EXP: Perhaps. But I find such philosophical formulations unsatisfactory. We
need a language rich enough to lead to interesting classifications and
quantifications of contextuality. The language of CbD is rich enough. Trivial
renaming of all objects into object-contexts is not.

Int: I agree. I think I understand the definition of contextuality in CbD. 
However, I may need more persuasion to accept it. Let me return to my 
misgivings about “contextuality-1” and “contextuality-2.” Why do we need 
the latter? 

EXP: Let me remind you that “contextuality-1” is inconsistent connectedness:
for some objects, their measurements have different distributions in
different contexts. We may very well have a consistently connected system,
however, that is, a system without “contextuality-1.” Will we simply declare it
noncontextual since all contextuality is “contextuality-1”? Again, one can say this if
one so wishes, but this terminology will not change the fact that there is an
important distinction within the class of consistently connected systems of
measurements.

Int: Please remind me what this distinction is. 

EXP: If the distribution of the measurements of a given object is the same
across all contexts involving this object, then it is possible to couple these
measurements so that their copies in the coupling are equal to each other
with probability 1. Let’s call this the identity coupling. Now, there are
two possibilities. There can be a consistently connected system that has
a coupling in which this probability 1 is achieved for all objects; in other
words the identity couplings for different objects can all be put together
so that they are compatible with the observed distributions of the bunches
of measurements in each of the contexts. And there can be systems in
which such couplings do not exist: the observed bunches of measurements
are not compatible with the identity couplings for all the objects. This is
an important distinction, and it is captured by calling the systems of the
latter kind contextual. In quantum physics this distinction is related to such
questions as the (non)existence of hidden variables of which all observed
random variables in an experiment are functions.[8]

Int: Yes, I agree this distinction is important. But it seems to me in quantum
physics the case of consistent connectedness is the main if not the only
case to consider.[9] If we retain the traditional definition of contextuality for
such systems (perhaps in the CbD formulation), do we need to extend it to
inconsistent connectedness? You said the contextuality in the sense of CbD
exists “on top of” inconsistent connectedness. Couldn’t we, however, simply
ignore it? In other words, couldn’t we have a single notion of contextuality,
which coincides with “contextuality-2” for consistently connected systems
and with “contextuality-1” otherwise? I have almost asked this question
before, but we digressed (or at least I see it now as a digression) into the
discussion of how one can define objects and contexts.

EXP: Let me think of how to respond to this, and we will return to this in our
next conversation.


Conversation 3 


EXP: My dear Interlocutor, to defend a definition is a difficult task. A good
definition of a term should be intuitively plausible (although sometimes one’s
intuition itself should be “educated” to make it plausible), it should include
as special cases all examples and situations that are traditionally considered
to fall within the scope of the term, it should lead to productive development
(to allow one to prove nontrivial theorems), and have a growing set of
applications. I believe contextuality in the sense of CbD satisfies all these
desiderata, but I may be unable to discuss them with you comprehensively.

Int: Let us try intuitive plausibility. 

EXP: One argument I find persuasive is appealing to “small” inconsistencies
added to consistently connected systems with “large” contextuality.
Contextuality in CbD can be rigorously quantified,[10] but I will only need intuitive
guidance to present the argument. Consider our system
$\{(R_A^V, R_B^V), (R_A^W, R_B^W)\}$. You may recall that the criterion (necessary and
sufficient condition) for noncontextuality here is given by the inequality

$$\left|\langle R_A^V R_B^V\rangle - \langle R_A^W R_B^W\rangle\right| \le \left|\langle R_A^V\rangle - \langle R_A^W\rangle\right| + \left|\langle R_B^V\rangle - \langle R_B^W\rangle\right|.$$

Your proposal is to accept this formula only for consistently connected
systems, when

$$\left|\langle R_A^V\rangle - \langle R_A^W\rangle\right| + \left|\langle R_B^V\rangle - \langle R_B^W\rangle\right| = 0.$$

In this case the system is contextual if and only if

$$\left|\langle R_A^V R_B^V\rangle - \langle R_A^W R_B^W\rangle\right| > 0.$$

Now, the largest possible value of the last expression is 2, and, I think you
would agree, it is reasonable to say that the system $\{(R_A^V, R_B^V), (R_A^W, R_B^W)\}$
with this value equal to 2 exhibits the greatest possible degree of
contextuality, given the consistent connectedness.

8 Aut: This is, e.g., how the problem was formulated by Bell (1964). The possibility
of reformulating this in terms of the (non)existence of certain couplings (without using this
concept explicitly) was realized later, in Suppes and Zanotti (1981) and Fine (1982).

9 Aut: It may be a prevalent case but definitely not the only one: see, e.g., Bacciagaluppi
(2015).

10 Aut: See Dzhafarov et al. (2015a); Kujala et al. (2015); Kujala and Dzhafarov (2015);
de Barros et al. (2015).

Int: Is this maximum value 2 compatible with consistent connectedness? 

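EXP: Yes, it is. See these two distributions:

                  $R_B^W = +1$   $R_B^W = -1$
  $R_A^W = +1$         0             1/2          1/2
  $R_A^W = -1$        1/2             0           1/2
                      1/2            1/2

                  $R_B^V = +1$   $R_B^V = -1$
  $R_A^V = +1$        1/2             0           1/2
  $R_A^V = -1$         0             1/2          1/2
                      1/2            1/2

The computations yield all the expected values $\langle R_A^V\rangle, \langle R_A^W\rangle, \langle R_B^V\rangle, \langle R_B^W\rangle$
equal to zero, $\langle R_A^V R_B^V\rangle = 1$ and $\langle R_A^W R_B^W\rangle = -1$.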
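
These expected values can be recomputed directly from the two tables. The following is a minimal sketch, not part of the original text; the helper `mean` and the dictionary layout, with keys (value of R_A, value of R_B), are illustrative choices.

```python
from fractions import Fraction as F

half = F(1, 2)

# The two distributions from the tables; keys are (value of R_A, value of R_B).
table_w = {(1, 1): F(0), (1, -1): half, (-1, 1): half, (-1, -1): F(0)}
table_v = {(1, 1): half, (1, -1): F(0), (-1, 1): F(0), (-1, -1): half}

def mean(table, f):
    """Expected value of f(R_A, R_B) under the given joint distribution."""
    return sum(f(a, b) * pr for (a, b), pr in table.items())

summary = {}
for name, table in (("V", table_v), ("W", table_w)):
    summary[name] = (
        mean(table, lambda a, b: a),       # <R_A>
        mean(table, lambda a, b: b),       # <R_B>
        mean(table, lambda a, b: a * b),   # <R_A R_B>
    )
print(summary)

# Left-hand side of the criterion: |<R_A^V R_B^V> - <R_A^W R_B^W>|.
lhs = abs(summary["V"][2] - summary["W"][2])
print(lhs)
```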

Int: I see. Please continue with your example. 

EXP: Let us now introduce a minuscule inconsistency, say,

                  $R_B^W = +1$   $R_B^W = -1$
  $R_A^W = +1$         0             1/2          1/2
  $R_A^W = -1$        1/2             0           1/2
                      1/2            1/2

                  $R_B^V = +1$   $R_B^V = -1$
  $R_A^V = +1$        1/2             ε           1/2 + ε
  $R_A^V = -1$         0           1/2 - ε        1/2 - ε
                      1/2            1/2

The only difference of this system from the previous one is that now
$\langle R_A^V\rangle = 2\varepsilon$ rather than 0 and $\langle R_A^V R_B^V\rangle = 1 - 2\varepsilon$ rather than 1; but ε can be
chosen arbitrarily small. The system therefore can be made arbitrarily close
to the previous one. If I continue to follow your proposal, however, I should
abandon the joint expectations altogether and focus on the marginals only:
the system is contextual now simply because it is inconsistently connected,

$$\left|\langle R_A^V\rangle - \langle R_A^W\rangle\right| + \left|\langle R_B^V\rangle - \langle R_B^W\rangle\right| = 2\varepsilon > 0.$$

But you would probably agree that if ε is minuscule, the degree of
contextuality in the system is minuscule too, wouldn’t you?

Int: Indeed I would. And I can guess the rest of your argument too. When 
$\varepsilon$ is very small but nonzero, the system has a small degree of 
contextuality (which will be “contextuality-1”). As you make $\varepsilon$ 
smaller and smaller, the contextuality gets smaller and smaller. But as soon 
as $\varepsilon$ reaches zero, the contextuality jumps from the limiting zero 
value to the maximal possible value (because now it is “contextuality-2”). 
It is a strange behavior, I should admit. 

EXP: Precisely. I conclude that your concept of contextuality is not well-formed. 

If we distinguish inconsistent connectedness from contextuality in 
accordance with CbD, however, the problem disappears. The degree of 
contextuality in the second system, if $\varepsilon$ is very small, is only 
slightly smaller than the degree of contextuality in the first system. The 
inequality 

$$2 = |\langle R_a^V R_b^V\rangle - \langle R_a^W R_b^W\rangle| > |\langle R_a^V\rangle - \langle R_a^W\rangle| + |\langle R_b^V\rangle - \langle R_b^W\rangle| = 0$$

changes into 

$$2 - 2\varepsilon = |\langle R_a^V R_b^V\rangle - \langle R_a^W R_b^W\rangle| > |\langle R_a^V\rangle - \langle R_a^W\rangle| + |\langle R_b^V\rangle - \langle R_b^W\rangle| = 2\varepsilon.$$
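
The contrast between the two readings can be made concrete numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it assumes the perturbation sits in the marginal of object a in context V, uses hypothetical function names, and treats the margin by which the left side exceeds the right side as an informal stand-in for "degree of contextuality" (the conversation does not define a formal measure at this point).

```python
from fractions import Fraction

def inequality_sides(eps):
    """Both sides of the two-context inequality for the perturbed system."""
    # Assumed expectations: context V carries the perturbation, context W is unchanged.
    mean_a_V, mean_b_V, mean_ab_V = 2 * eps, 0, 1 - 2 * eps
    mean_a_W, mean_b_W, mean_ab_W = 0, 0, -1
    lhs = abs(mean_ab_V - mean_ab_W)                            # gap between product expectations
    rhs = abs(mean_a_V - mean_a_W) + abs(mean_b_V - mean_b_W)   # inconsistency of marginals
    return lhs, rhs

def two_notion_degree(eps):
    """Interlocutor's proposal: use the marginals if they differ, else the product gap."""
    lhs, rhs = inequality_sides(eps)
    return rhs if rhs > 0 else lhs

def cbd_margin(eps):
    """Informal illustration only: how far the left side exceeds the right side."""
    lhs, rhs = inequality_sides(eps)
    return lhs - rhs

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000), Fraction(0)):
    print(eps, two_notion_degree(eps), cbd_margin(eps))
```

Under these assumptions the first quantity shrinks with eps and then jumps to 2 at eps = 0, while the second approaches 2 without any jump.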

Int: I agree this feature speaks in favor of the CbD concept. Does this mean, 
however, that inconsistent connectedness and contextuality (or 
“contextuality-1” and “contextuality-2,” even if you don’t like this 
terminology) have fundamentally different ontologies? 

EXP: It is a question to which I do not have a definitive answer. 
Inconsistent connectedness in most, if not all, cases has trivial and 
well-understood causes: conditions under which measurements are made affect 
these measurements, either through physical interference or through 
context-dependent measurement biases. In the example with written and verbal 
questions, reading a question invokes very different psychological processes 
than hearing it asked: there is nothing remarkable in the distributions of 
responses in the two cases being different. Or consider a formally identical 
but empirically different example: replace the presentation mode with the 
order in which the two questions are asked, so that instead of V we have 
context A → B and instead of W the context B → A. In this case it is natural 
to expect that the first question affects a person’s response to the second 
question.¹¹ In physics, it is common that different contexts correspond to 
different experimental set-ups, so that one and the same object in different 
contexts is simply measured differently.¹² 

Int: And you claim that the causes for contextuality are different? 

EXP: At least they may be different, and they are definitely different in some 
cases. Take the famous Alice-Bob experiment (more formally known as the 
EPR/Bohm or EPR/Bell paradigm), in which Alice measures spins in a 
particle 1 and Bob measures spins in a particle 2. The two particles are 
entangled, meaning that they were created in a special, singular state that 
makes Alice’s and Bob’s spins that are measured in the opposite directions 
perfectly correlated. Alice and Bob make their measurements simultaneously 
from some Charlie’s point of view, and this precludes any information 
traveling from Alice to Bob or vice versa. Nevertheless, we have a clear 
case of contextuality (“contextuality-2”) in this case.¹³ If we modify this 
experiment so that Alice and Bob (from Charlie’s point of view) make their 
measurements with an interval between them that allows for signaling, and if 
we assume that some form of signaling is indeed effected, then we may have 
distributions of Alice’s measurement depending on Bob’s settings and/or 
vice versa. This would be inconsistent connectedness. But contextuality 
may still be measurable “on top of” this inconsistency.¹⁴ 

Int: But you don’t know if contextuality is always so different in nature from 
inconsistent connectedness, do you? 

EXP: No, which is why I said I did not have a definitive answer. 

Int: Do we know if contextuality of the kind we find in quantum physics also 
exists in non-physical systems? Perhaps in human behavior? 

EXP: No, we don’t. A recent analysis of available experimental data in 
psychology seems to suggest that the answer is negative.¹⁵ But we can’t know 
for sure, because in psychology we lack a theory analogous to quantum 
mechanics and have to grope in darkness trying now this and then that. 

Int: So it is possible that contextuality only exists in quantum physics? This 
would be disappointing, wouldn’t it? 

EXP: Not necessarily. Lack of contextuality, if it can be formulated as a general 
principle in some domain, allows us to predict outcomes of experiments, or at 
least predict what outcomes are not possible. In psychology this hypothetical 
principle would, in a sense, create a general theory that we otherwise lack. 

11 Aut: See Moore (2002); Wang et al. (2014); Dzhafarov et al. (2015d). 

12 Aut: See the discussion of an experiment by Lapkiewicz et al. (2011) in Kujala et al. 
(2015). 

13 Aut: See, e.g., Dzhafarov and Kujala (2014b). 

14 Aut: See the chapter by Kujala and Dzhafarov in this volume. 

15 Aut: See Dzhafarov et al. (2015c). 
Int: I see. Well, we have covered a lot of ground in our conversations. Would 
you like to summarize? 

EXP: I could summarize the definition of contextuality in a more formal way. I 
recall you did not like informal presentations. 

Int: Please do. 

EXP: The primitive concepts of the theory are¹⁶ 

1. set $Q$ of “objects being measured,” 

2. set $C$ of “contexts of measurements,” 

3. relation “object $q$ is measured in context $c$,” $q \prec c$, and 

4. set of “measurements,” random variables 
$R = \{R_q^c : q \prec c\}_{q \in Q,\, c \in C}$. 

Two postulates of the theory are 

1. for a given context $c \in C$, the set of measurements 
$R^c = \{R_q^c : q \prec c\}_{q \in Q}$ is a random variable (which means the 
measurements are jointly distributed); 

2. any two measurements $R_q^c$, $R_{q'}^{c'}$ with $c \neq c'$ are 
stochastically unrelated (whether $q = q'$ or not). 


We call the set $R_q = \{R_q^c : q \prec c\}_{c \in C}$ a connection for 
object $q$. The elements of a connection are pairwise stochastically 
unrelated. Let $T_q = \{T_q^c : q \prec c\}_{c \in C}$ be a coupling for 
connection $R_q$, i.e., $T_q^c$ has the same distribution as $R_q^c$ for all 
$c$ such that $q \prec c$. This coupling is called maximal if 

$$\Pr\left[\text{for any } c, c' \in C,\ T_q^c = T_q^{c'}\right]$$

is maximal across all possible couplings for $R_q$. Such a coupling exists, 
and the maximal probability in question, $p_q$, is uniquely defined. 


Let $S = \{S_q^c : q \prec c\}_{q \in Q,\, c \in C}$ be a coupling for the 
system $R$, i.e., $S^c = \{S_q^c : q \prec c\}_{q \in Q}$ is distributed as 
$R^c = \{R_q^c : q \prec c\}_{q \in Q}$ for all $c \in C$. This coupling is 
called maximally connected if 

$$\Pr\left[\text{for any } c, c' \in C,\ S_q^c = S_q^{c'}\right] = p_q$$

for every $q \in Q$. If $R$ has a maximally connected coupling it is 
noncontextual. If $R$ does not have a maximally connected coupling it is 
contextual. 
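
For a connection with exactly two elements, as in the two-context example discussed earlier, the maximal probability $p_q$ has a standard closed form: the largest probability with which two random variables with given distributions can be made to coincide is the sum, over all values, of the smaller of the two probabilities. The sketch below illustrates this fact only; `max_coincidence` and `binary_dist` are hypothetical helper names, and the sketch does not cover connections with more than two elements.

```python
from fractions import Fraction

def binary_dist(mean):
    """Distribution of a +/-1 random variable with the given expected value."""
    p_plus = (1 + Fraction(mean)) / 2
    return {+1: p_plus, -1: 1 - p_plus}

def max_coincidence(dist_1, dist_2):
    """Largest Pr[T1 == T2] over all couplings of two distributions on a finite set."""
    values = set(dist_1) | set(dist_2)
    return sum(min(dist_1.get(v, 0), dist_2.get(v, 0)) for v in values)

# Identical marginals can be coupled so that the two variables always coincide.
print(max_coincidence(binary_dist(0), binary_dist(0)))

# Marginals with expected values 1/5 and 0 cannot coincide with probability 1.
print(max_coincidence(binary_dist(Fraction(1, 5)), binary_dist(0)))
```

For two ±1 variables with expected values m and m′ this reduces to 1 − |m − m′|/2, so $p_q = 1$ exactly when the two marginals agree.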

Int: Do we know criteria of (non)contextuality analogous to the one you 
mentioned before, for the system $\{(R_a^V, R_b^V), (R_a^W, R_b^W)\}$? 

16 Aut: For details see Dzhafarov et al. (2015a); Kujala et al. (2015); Kujala and Dzhafarov 
(2015); de Barros et al. (2015). 
EXP: Yes, this was a special (in fact, simplest) example of a broad class of 
systems for which we have criteria of (non)contextuality. These systems are 
called cyclic, and they are defined as follows: for some $n > 1$, 

1. the set of objects is $Q = \{q_1, \ldots, q_n\}$, 

2. the set of contexts is $C = \{c_1, \ldots, c_n\}$, 

3. for each object $q$ there are precisely two contexts $c, c'$ such that 
$q \prec c$ and $q \prec c'$, 

4. for each context $c$ there are precisely two objects $q, q'$ such that 
$q \prec c$ and $q' \prec c$, 

5. all $R_q^c$ with $q \prec c$ are binary, ±1. 

By appropriate enumeration we can always achieve a cyclic structure: 
$q_i \prec c_i$ and $q_{i \oplus 1} \prec c_i$ for $i = 1, \ldots, n$ (where 
$\oplus$ is cyclic increment, with $n \oplus 1 = 1$). 
The system is noncontextual if and only if 

$$\max_{\text{odd number of minuses}} \sum_{i=1}^{n} \left(\pm \langle R_{q_i}^{c_i} R_{q_{i \oplus 1}}^{c_i}\rangle\right) \le (n - 2) + \sum_{i=1}^{n} \left|\langle R_{q_{i \oplus 1}}^{c_i}\rangle - \langle R_{q_{i \oplus 1}}^{c_{i \oplus 1}}\rangle\right|,$$

where each ± is replaced with + or − so that 
the minus is chosen an odd number of times (1, 3,...). 
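
The criterion for cyclic systems can be evaluated directly from expectations. The following is a minimal brute-force sketch, assuming the inputs are listed in cyclic order: `products[i]` is the expected product of the two measurements in context i, and `connection_pairs[i]` holds the two expected values of the object shared by context i and the next context. The function names and this input layout are assumptions of the sketch, and enumerating all sign patterns is practical only for small n.

```python
from itertools import product

def s_odd(values):
    """Maximum of sum(+/- v) over sign patterns with an odd number of minus signs."""
    best = None
    for signs in product((+1, -1), repeat=len(values)):
        if signs.count(-1) % 2 == 1:
            total = sum(s * v for s, v in zip(signs, values))
            best = total if best is None else max(best, total)
    return best

def is_noncontextual_cyclic(products, connection_pairs):
    """Check the stated inequality for a cyclic system of n >= 2 contexts."""
    n = len(products)
    bound = (n - 2) + sum(abs(x - y) for x, y in connection_pairs)
    return s_odd(products) <= bound

# The two-context example: product expectations +1 and -1, all marginals zero.
print(is_noncontextual_cyclic([1, -1], [(0, 0), (0, 0)]))

# Four contexts with all expectations zero.
print(is_noncontextual_cyclic([0, 0, 0, 0], [(0, 0)] * 4))
```

For n = 2 the left side reduces to the absolute difference of the two product expectations and the term n − 2 vanishes, which recovers the inequality used in the two-context example.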

Int: This looks like a good point at which to adjourn, my dear Expositor. I 
have to think this all over. I am sure I will come up with more questions 
and misgivings. 

EXP: I am sure you will, my dear Interlocutor. Until then, good bye.¹⁷ 